Studying the Performance of the Jellyfish Search Optimiser for the Application of Projection Pursuit

H. Sherry Zhang

University of Texas at Austin

2024-09-04

Optimisation in projection pursuit

  • Data: \(\mathbf{X}_{n \times p}\); Basis: \(\mathbf{A}_{p\times d}\)
  • Projection: \(\mathbf{Y} = \mathbf{X} \cdot \mathbf{A}\)
  • Index function: \(f: \mathbb{R}^{n \times d} \mapsto \mathbb{R}\)
  • Optimisation: \(\arg \max_{\mathbf{A}} f(\mathbf{X} \cdot \mathbf{A}) ~~~ s.t. ~~~ \mathbf{A}^{\prime} \mathbf{A} = I_d\)
  • 5 vars (\(x_1\) - \(x_5\)), 1000 obs simulated
    • One variable (\(x_2\)) is a mixture normal
    • others are random normal
  • 1D projection using the holes index: \(\propto 1 -\frac{1}{n} \sum_{i = 1}^n \exp(-\frac{1}{2} y_i y_i^{\prime})\)

Motivation

The work also reveals inadequacies in the tour optimization algorithm, that may benefit from newly developed techniques and software tools. Exploring this area would help improve the guided tours. As new optimization techniques become available, adapting these to the guided tour would extend the technique to a broader range of problems. (Laa & Cook, 2020)

Continuation of the previous work

The Jellyfish search optimiser

Chou, J. S., & Truong, D. N. (2021). A novel metaheuristic optimizer inspired by behavior of jellyfish in ocean. Applied Mathematics and Computation, 389, 125535.

A diagram to explain the Jellyfish search optimiser

A snapshot of the optimiser in the R code (maybe need to explain the tourr code??)

An animation of JSO in action

Properties proposed

✅ smoothness, ✅ squintability, ❌ flexibility, ❌ rotation invariance, ✅ speed

Smoothness

what does it mean by smooth and not smooth

Squintability

A small squint angle because you have to be very close to the optimal projection plane to be able to see the structure of the data.

(we see a hill over there! Now we see a hill)

Define Squintability

Projection distance between two bases \(A\) and \(A^*\), \(d(A, A^*)\):

\[d(A, A^*) = \lVert AA^\prime - A^*A^{*\prime}\ \rVert _F\]

where \(\lVert . \rVert _F\) denotes the Frobenius norm, given by \(\lVert M \rVert _F = \sqrt{\sum_{ij} M_{ij}^2}\).


Index-distance curve \(g\) maps \(d(A, A^*)\) to the index value \(f(XA)\), such that \[g(d(A, A^*)) = f(XA)\]

Squintability:

\[\varsigma(f) = -c \times \max_{d} g'(d) \times \arg \max_{d} g'(d)\]

use c = 4 to be consistent with estimating with parametric model

Calculate squintability

sigmoid curve: \(\ell(x):=\frac{1}{1+\exp(\theta_{3}(x-\theta_{2}))}\ \)

parametric model: \(f(x)=(\theta_{1}-\theta_{4})\frac{\ell(x)-\ell(x_{\max})}{\ell(0)-\ell(x_{\max})}+\theta_{4}\ \)

Squintability: \(\varsigma=\frac{(\theta_{1}-\theta_{4})\theta_{2}\theta_{3}}{\ell(0)-\ell(x_{\max})}\)

Example:

shall I do a specific calculation example here?

Simulation setup

the “pipe” shape:

the holes index

data dimension d = 6, 8, 10, 12

  • different JSO hyper-parameters:

    • number of jellyfishes (20, 50, 100)
    • maximum number of tries (50, 100)

the “sine” shape:

index function: “MIC”, “TIC”, “dcor2d”, “loess2d”, “splines2d”, “stringy”

data dimension d = 6

For “MIC” and “TIC”, d = 8 is also included

Sucess rate

50 repetitions for each case to calculate success rate

xxx out of 50 that finds a final index value within 0.05 of the best index value found among all 50 simulations

Data for modelling

calculate smoothness, squintability, as well as average speed for each case

In total 52 observations

index d smoothness squintability n. jellyfish max. tries success rate time (sec)
MIC 6 2.46 1.26 20 50 0.12 2.48
MIC 6 2.46 1.26 20 100 0.24 8.95
MIC 6 2.46 1.26 50 50 0.52 5.65
MIC 6 2.46 1.26 50 100 0.64 13.22
MIC 6 2.46 1.26 100 50 0.76 19.45
MIC 8 2.64 0.77 20 50 0.08 2.57
MIC 8 2.64 0.77 20 100 0.08 4.96

Results

🔗